Accelerating AI inferencing with external KV Cache on Managed Lustre
cloud.google.com·4h
🏗️LLM Infrastructure
How We Saved 70% of CPU and 60% of Memory in Refinery’s Go Code, No Rust Required.
🔬Rust Profiling
Rearchitecting Vector Search: A Migration from MongoDB Atlas to Qdrant
pub.towardsai.net·13h
🎯Qdrant
Don't give Postgres too much memory
🔮Prefetching
Your AI Models Aren’t Slow, but Your Data Pipeline Might Be
thenewstack.io·2h
📊Model Serving Economics
🧠🚀 Excited to introduce Supervised Reinforcement Learning — a framework that leverages expert trajectories to teach small LMs how to reason through hard problems ...
threadreaderapp.com·18h
🏗️LLM Infrastructure
How Distributed ACID Transactions Work in TiDB
pingcap.com·4h
🏗️FoundationDB
MIT’s Survey on Accelerators and Processors for Inference, with Peak Performance and Power Comparisons
semiengineering.com·3h
🏗️LLM Infrastructure
Tencent/WeKnora
github.com·18h
🔎Meilisearch
Show HN: GPU-accelerated sandboxes for running AI coding agents in parallel [video]
🖥GPUs
Examining the Future: Vertex's Earnings Outlook
nordot.app·3h
🖥GPUs
Andrew Shindyapin: AI’s Impact on Software Development
skmurphy.com·17h
⚡Developer Experience
From Lossy to Lossless Reasoning
🔤Tokenization
Building Up And Sanding Down
endler.dev·20h
🪄Prompt Engineering
Store and search logs at petabyte scale in your own infrastructure with Datadog CloudPrem
datadoghq.com·20h
🏠Self-hosting